https://rss.onlinelibrary.wiley.com/doi/10.1111/j.1740-9713.2018.01169.x
http://www.youtube.com/watch?v=XcBLEVknqvY
https://www.rstudio.com/products/rstudio/download/
https://moderndive.com/2-getting-started.html
https://cran.r-project.org/web/packages/addinslist/README.html
https://rstudio.github.io/rstudioaddins/
There are many ways to do the same thing
R Syntax Comparison::CHEAT SHEET
https://www.amelia.mn/Syntax-cheatsheet.pdf
#RStats — There are always several ways to do the same thing… nice example on with the identity matrix by @TeaStats https://t.co/O3GXdPiM32
— Colin Fay 🤘 (@_ColinFay) April 1, 2019
I love the #rstats community.
— Frank Elavsky ᴰᵃᵗᵃ ᵂᶦᶻᵃʳᵈ (@Frankly_Data) July 3, 2018
Someone is like, “oh hey peeps, I saw a big need for this mundane but difficult task that I infrequently do, so I created a package that will literally scrape the last bits of peanut butter out of the jar for you. It's called pbplyr.”
What a tribe.
https://blog.mitchelloharawild.com/blog/user-2018-feature-wall/
Available CRAN Packages By Name
https://cran.r-project.org/web/packages/available_packages_by_name.html
CRAN Task Views
https://cran.r-project.org/web/views/
Bioconductor
https://www.bioconductor.org
RecommendR
http://recommendr.info/
pkgsearch
CRAN package search
https://github.com/metacran/pkgsearch
CRANsearcher
https://github.com/RhoInc/CRANsearcher
Awesome R
https://awesome-r.com/
install.packages("tidyverse", dependencies = TRUE)
install.packages("jmv", dependencies = TRUE)
install.packages("questionr", dependencies = TRUE)
install.packages("Rcmdr", dependencies = TRUE)
install.packages("summarytools")
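The calls above reinstall each package every time they are run. A minimal sketch, assuming you would rather install only what is missing; the helper name `missing_packages` is hypothetical and not part of any package:

```r
# Packages named in the install.packages() calls above.
wanted <- c("tidyverse", "jmv", "questionr", "Rcmdr", "summarytools")

# Hypothetical helper: return the packages that are not installed yet.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(wanted)
print(to_install)

# Uncomment to install only the missing ones:
# if (length(to_install) > 0) install.packages(to_install, dependencies = TRUE)
```

The install line is left commented out so that running the sketch never changes your library by accident.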
# install.packages("tidyverse", dependencies = TRUE)
# install.packages("jmv", dependencies = TRUE)
# install.packages("questionr", dependencies = TRUE)
# install.packages("Rcmdr", dependencies = TRUE)
# install.packages("summarytools")
# require(tidyverse)
# require(jmv)
# require(questionr)
# library(summarytools)
# library(gganimate)
RDocumentation https://www.rdocumentation.org
R Package Documentation https://rdrr.io/
GitHub
Stack Overflow
How I use #rstats
— Emily Bovee (@ebovee09) August 10, 2018
h/t @ThePracticalDev pic.twitter.com/erRnTG0Ujr
Writing [R] can also help.
http://cran.r-project.org/doc/contrib/Baggott-refcard-v2.pdf
https://www.rstudio.com/resources/cheatsheets/
https://github.com/qinwf/awesome-R#readme
https://twitter.com/hashtag/rstats?src=hash
Got a question to ask on @SlackHQ or post on @github? No time to read the long post on how to use reprex? Here is a 20-second gif for you to format your R codes nicely and for others to reproduce your problem. (An example from a talk given by @JennyBryan) #rstat pic.twitter.com/gpuGXpFIsX
— ZhiYang (@zhiiiyang) October 18, 2018
https://support.rstudio.com/hc/en-us/articles/200526207-Using-Projects
https://support.rstudio.com/hc/en-us/articles/218611977-Importing-Data-with-RStudio
Spreadsheet users using #rstats: where's the data?
#rstats users using spreadsheets: where's the code?
— Leonard Kiefer (@lenkiefer) July 7, 2018
View(data)
data
head
tail
glimpse
str
skimr::skim()
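The functions listed above can be tried on the built-in `iris` data set. This is a minimal sketch, assuming `iris` stands in for your own data; `View()` is left out because it needs an interactive session, and the two add-on functions run only if their packages happen to be installed:

```r
# iris ships with base R, so no package is needed for these calls.
data(iris)

dim(iris)        # number of rows and columns
head(iris, 3)    # first three rows
tail(iris, 3)    # last three rows
str(iris)        # column types and a preview of the values

# glimpse() and skim() come from add-on packages; run them only if installed.
if (requireNamespace("dplyr", quietly = TRUE)) dplyr::glimpse(iris)
if (requireNamespace("skimr", quietly = TRUE)) print(skimr::skim(iris))
```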
The questionr package will be used
https://juba.github.io/questionr/articles/recoding_addins.html
summary()
mean
median
min
max
sd
table()
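A short sketch of the summary functions listed above, applied to one numeric column and one categorical column of the built-in `iris` data. The choice of columns is only an illustration:

```r
x <- iris$Sepal.Length

summary(x)   # min, quartiles, median and mean in one call
mean(x)
median(x)
min(x)
max(x)
sd(x)

# table() counts how often each value of a categorical variable occurs.
species_counts <- table(iris$Species)
species_counts

# mean() and most of the others return NA when values are missing,
# unless na.rm = TRUE is set.
mean(c(1, 2, NA))
mean(c(1, 2, NA), na.rm = TRUE)
```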
Parsed with column specification:
cols(
  Sepal.Length = col_double(),
  Sepal.Width = col_double(),
  Petal.Length = col_double(),
  Petal.Width = col_double(),
  Species = col_character()
)
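The column specification above is the kind of message `readr::read_csv()` prints when it parses a file, but the call that created `irisdata` is not shown. A minimal sketch, assuming the data came from a CSV copy of `iris`; the temporary file path is an assumption made only so the example runs on its own:

```r
# Write iris to a temporary CSV so there is a file to import (assumed setup).
csv_path <- tempfile(fileext = ".csv")
write.csv(iris, csv_path, row.names = FALSE)

# Read it back; use readr if it is installed, otherwise fall back to base R.
if (requireNamespace("readr", quietly = TRUE)) {
  irisdata <- readr::read_csv(csv_path)
} else {
  irisdata <- read.csv(csv_path, stringsAsFactors = FALSE)
}

str(irisdata)
```

Either way `Species` arrives as a character column, which matches the `col_character()` entry in the specification.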
jmv::descriptives(
data = irisdata,
vars = "Sepal.Length",
splitBy = "Species",
freq = TRUE,
hist = TRUE,
dens = TRUE,
bar = TRUE,
box = TRUE,
violin = TRUE,
dot = TRUE,
mode = TRUE,
sum = TRUE,
sd = TRUE,
variance = TRUE,
range = TRUE,
se = TRUE,
skew = TRUE,
kurt = TRUE,
quart = TRUE,
pcEqGr = TRUE)
DESCRIPTIVES
Descriptives
─────────────────────────────────────────────────────
Species Sepal.Length
─────────────────────────────────────────────────────
N setosa 50
versicolor 50
virginica 50
Missing setosa 0
versicolor 0
virginica 0
Mean setosa 5.01
versicolor 5.94
virginica 6.59
Std. error mean setosa 0.0498
versicolor 0.0730
virginica 0.0899
Median setosa 5.00
versicolor 5.90
virginica 6.50
Mode setosa 5.00
versicolor 5.50
virginica 6.30
Sum setosa 250
versicolor 297
virginica 329
Standard deviation setosa 0.352
versicolor 0.516
virginica 0.636
Variance setosa 0.124
versicolor 0.266
virginica 0.404
Range setosa 1.50
versicolor 2.10
virginica 3.00
Minimum setosa 4.30
versicolor 4.90
virginica 4.90
Maximum setosa 5.80
versicolor 7.00
virginica 7.90
Skewness setosa 0.120
versicolor 0.105
virginica 0.118
Std. error skewness setosa 0.337
versicolor 0.337
virginica 0.337
Kurtosis setosa -0.253
versicolor -0.533
virginica 0.0329
Std. error kurtosis setosa 0.662
versicolor 0.662
virginica 0.662
25th percentile setosa 4.80
versicolor 5.60
virginica 6.23
50th percentile setosa 5.00
versicolor 5.90
virginica 6.50
75th percentile setosa 5.20
versicolor 6.30
virginica 6.90
─────────────────────────────────────────────────────
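Several rows of the table above can be checked with base R alone. This is a minimal sketch, not how jmv computes its results; the helper name `describe` is hypothetical:

```r
# Split one numeric column by the grouping variable.
by_species <- split(iris$Sepal.Length, iris$Species)

# Hypothetical helper returning a few of the statistics shown above.
describe <- function(x) {
  c(n      = length(x),
    mean   = mean(x),
    se     = sd(x) / sqrt(length(x)),  # standard error of the mean
    median = median(x),
    sd     = sd(x),
    min    = min(x),
    max    = max(x))
}

# One column per species, one row per statistic.
stats <- sapply(by_species, describe)
round(stats, 3)
```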
# install.packages("scatr")
scatr::scat(
data = irisdata,
x = "Sepal.Length",
y = "Sepal.Width",
group = "Species",
marg = "dens",
line = "linear",
se = TRUE)
https://cran.r-project.org/web/packages/summarytools/vignettes/Introduction.html
Type: Factor
| | Freq | % Valid | % Valid Cum. | % Total | % Total Cum. |
|---|---|---|---|---|---|
| setosa | 50 | 33.33 | 33.33 | 33.33 | 33.33 |
| versicolor | 50 | 33.33 | 66.67 | 33.33 | 66.67 |
| virginica | 50 | 33.33 | 100.00 | 33.33 | 100.00 |
| <NA> | 0 | | | 0.00 | 100.00 |
| Total | 150 | 100.00 | 100.00 | 100.00 | 100.00 |
‘omit.headings’ argument has been replaced by ‘headings’; setting headings = FALSE
| | Freq | % | % Cum. |
|---|---|---|---|
| setosa | 50 | 33.33 | 33.33 |
| versicolor | 50 | 33.33 | 66.67 |
| virginica | 50 | 33.33 | 100.00 |
| Total | 150 | 100.00 | 100.00 |
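A comparable frequency table can be built in base R. This is a minimal sketch of the idea, not the summarytools implementation; the column names `Pct` and `Pct.Cum` are chosen here for illustration:

```r
counts <- table(iris$Species)
pct    <- 100 * prop.table(counts)   # share of each level in percent

freq_table <- data.frame(
  Freq      = as.vector(counts),
  Pct       = round(as.vector(pct), 2),
  Pct.Cum   = round(cumsum(as.vector(pct)), 2),
  row.names = names(counts)
)
freq_table
```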
| smoker | diseased: Yes | diseased: No | Total |
|---|---|---|---|
| Yes | 125 (41.9%) | 173 (58.1%) | 298 (100.0%) |
| No | 99 (14.1%) | 603 (85.9%) | 702 (100.0%) |
| Total | 224 (22.4%) | 776 (77.6%) | 1000 (100.0%) |
Generated by summarytools 0.9.3 (R version 3.5.3)
2019-04-18
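The percentages in the smoker by diseased table are row percentages: each count is divided by its row total. A minimal sketch that rebuilds them in base R from the four counts shown above, so the `tobacco` data set itself is not needed:

```r
# Counts copied from the smoker x diseased table above.
counts <- as.table(matrix(c(125, 173,
                             99, 603),
                          nrow = 2, byrow = TRUE,
                          dimnames = list(smoker   = c("Yes", "No"),
                                          diseased = c("Yes", "No"))))

# margin = 1 divides each cell by its row total.
row_pct <- round(100 * prop.table(counts, margin = 1), 1)
row_pct

# addmargins() appends the Total row and column.
addmargins(counts)
```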
with(tobacco,
     print(ctable(smoker, diseased, prop = 'n', totals = FALSE),
           headings = FALSE, method = "render"))
| smoker | diseased: Yes | diseased: No |
|---|---|---|
| Yes | 125 | 173 |
| No | 99 | 603 |
Non-numerical variable(s) ignored: Species
### Descriptive Statistics
#### iris
N: 150
| | Petal.Length | Petal.Width | Sepal.Length | Sepal.Width |
|---|---|---|---|---|
| Mean | 3.76 | 1.20 | 5.84 | 3.06 |
| Std.Dev | 1.77 | 0.76 | 0.83 | 0.44 |
| Min | 1.00 | 0.10 | 4.30 | 2.00 |
| Q1 | 1.60 | 0.30 | 5.10 | 2.80 |
| Median | 4.35 | 1.30 | 5.80 | 3.00 |
| Q3 | 5.10 | 1.80 | 6.40 | 3.30 |
| Max | 6.90 | 2.50 | 7.90 | 4.40 |
| MAD | 1.85 | 1.04 | 1.04 | 0.44 |
| IQR | 3.50 | 1.50 | 1.30 | 0.50 |
| CV | 0.47 | 0.64 | 0.14 | 0.14 |
| Skewness | -0.27 | -0.10 | 0.31 | 0.31 |
| SE.Skewness | 0.20 | 0.20 | 0.20 | 0.20 |
| Kurtosis | -1.42 | -1.36 | -0.61 | 0.14 |
| N.Valid | 150.00 | 150.00 | 150.00 | 150.00 |
| Pct.Valid | 100.00 | 100.00 | 100.00 | 100.00 |
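Two of the less familiar rows in the table above, CV and IQR, are simple to compute by hand. A minimal sketch for `Petal.Length` in base R:

```r
x <- iris$Petal.Length

cv  <- sd(x) / mean(x)               # coefficient of variation
q   <- quantile(x, c(0.25, 0.75))    # Q1 and Q3
iqr <- IQR(x)                        # Q3 - Q1

round(c(CV = cv, IQR = iqr), 2)
q
```

The CV is unitless, which is why it can be compared across variables measured on different scales.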
descr(iris, stats = c("mean", "sd", "min", "med", "max"), transpose = TRUE,
      headings = FALSE, style = "rmarkdown")
Non-numerical variable(s) ignored: Species
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Petal.Length | 3.76 | 1.77 | 1.00 | 4.35 | 6.90 |
| Petal.Width | 1.20 | 0.76 | 0.10 | 1.30 | 2.50 |
| Sepal.Length | 5.84 | 0.83 | 4.30 | 5.80 | 7.90 |
| Sepal.Width | 3.06 | 0.44 | 2.00 | 3.00 | 4.40 |
text graphs are displayed; set ‘tmp.img.dir’ parameter to activate png graphs
NAs introduced by coercion
### Data Frame Summary
#### tobacco
Dimensions: 1000 x 9
Duplicates: 2
| No | Variable | Stats / Values | Freqs (% of Valid) | Graph | Valid | Missing |
|---|---|---|---|---|---|---|
| 1 | gender [factor] | 1. F; 2. M | 489 (50.0%); 489 (50.0%) | IIIIIIIIII; IIIIIIIIII | 978 (97.8%) | 22 (2.2%) |
| 2 | age [numeric] | Mean (sd): 49.6 (18.3); min < med < max: 18 < 50 < 80; IQR (CV): 32 (0.4) | 63 distinct values | | 975 (97.5%) | 25 (2.5%) |
| 3 | age.gr [factor] | 1. 18-34; 2. 35-50; 3. 51-70; 4. 71 + | 258 (26.5%); 241 (24.7%); 317 (32.5%); 159 (16.3%) | IIIII; IIII; IIIIII; III | 975 (97.5%) | 25 (2.5%) |
| 4 | BMI [numeric] | Mean (sd): 25.7 (4.5); min < med < max: 8.8 < 25.6 < 39.4; IQR (CV): 5.7 (0.2) | 974 distinct values | | 974 (97.4%) | 26 (2.6%) |
| 5 | smoker [factor] | 1. Yes; 2. No | 298 (29.8%); 702 (70.2%) | IIIII; IIIIIIIIIIIIII | 1000 (100%) | 0 (0%) |
| 6 | cigs.per.day [numeric] | Mean (sd): 6.8 (11.9); min < med < max: 0 < 0 < 40; IQR (CV): 11 (1.8) | 37 distinct values | | 965 (96.5%) | 35 (3.5%) |
| 7 | diseased [factor] | 1. Yes; 2. No | 224 (22.4%); 776 (77.6%) | IIII; IIIIIIIIIIIIIII | 1000 (100%) | 0 (0%) |
| 8 | disease [character] | 1. Hypertension; 2. Cancer; 3. Cholesterol; 4. Heart; 5. Pulmonary; 6. Musculoskeletal; 7. Diabetes; 8. Hearing; 9. Digestive; 10. Hypotension; [ 3 others ] | 36 (16.2%); 34 (15.3%); 21 ( 9.5%); 20 ( 9.0%); 20 ( 9.0%); 19 ( 8.6%); 14 ( 6.3%); 14 ( 6.3%); 12 ( 5.4%); 11 ( 5.0%); 21 ( 9.5%) | III; III; I; I; I; I; I; I; I; I | 222 (22.2%) | 778 (77.8%) |
| 9 | samp.wgts [numeric] | Mean (sd): 1 (0.1); min < med < max: 0.9 < 1 < 1.1; IQR (CV): 0.2 (0.1) | 0.86!: 267 (26.7%); 1.04!: 249 (24.9%); 1.05!: 324 (32.4%); 1.06!: 160 (16.0%); ! rounded | IIIII; IIII; IIIIII; III | 1000 (100%) | 0 (0%) |
# First save the results
iris_stats_by_species <- by(data = iris,
INDICES = iris$Species,
FUN = descr, stats = c("mean", "sd", "min", "med", "max"),
transpose = TRUE)
# Then use view(), like so:
view(iris_stats_by_species, method = "pander", style = "rmarkdown")
Non-numerical variable(s) ignored: Species
### Descriptive Statistics
#### iris
Group: Species = setosa
N: 50
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Petal.Length | 1.46 | 0.17 | 1.00 | 1.50 | 1.90 |
| Petal.Width | 0.25 | 0.11 | 0.10 | 0.20 | 0.60 |
| Sepal.Length | 5.01 | 0.35 | 4.30 | 5.00 | 5.80 |
| Sepal.Width | 3.43 | 0.38 | 2.30 | 3.40 | 4.40 |
Group: Species = versicolor
N: 50
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Petal.Length | 4.26 | 0.47 | 3.00 | 4.35 | 5.10 |
| Petal.Width | 1.33 | 0.20 | 1.00 | 1.30 | 1.80 |
| Sepal.Length | 5.94 | 0.52 | 4.90 | 5.90 | 7.00 |
| Sepal.Width | 2.77 | 0.31 | 2.00 | 2.80 | 3.40 |
Group: Species = virginica
N: 50
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| Petal.Length | 5.55 | 0.55 | 4.50 | 5.55 | 6.90 |
| Petal.Width | 2.03 | 0.27 | 1.40 | 2.00 | 2.50 |
| Sepal.Length | 6.59 | 0.64 | 4.90 | 6.50 | 7.90 |
| Sepal.Width | 2.97 | 0.32 | 2.20 | 3.00 | 3.80 |
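The grouping in the tables above comes from base R's `by()`, which splits a data frame by a factor and applies a function to each piece. A minimal sketch with `colMeans` in place of `descr`, so it runs without summarytools:

```r
# Split the four numeric columns by species and take column means in each group.
means_by_species <- by(data = iris[, 1:4],
                       INDICES = iris$Species,
                       FUN = colMeans)

# Each element is a named vector; bind them into one matrix, one row per group.
means_matrix <- do.call(rbind, means_by_species)
round(means_matrix, 2)
```

The rows follow the order of the factor levels, so the first row is setosa.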
Data Frame: tobacco
N: 258
| | 18-34 | 35-50 | 51-70 | 71 + |
|---|---|---|---|---|
| Mean | 23.84 | 25.11 | 26.91 | 27.45 |
| Std.Dev | 4.23 | 4.34 | 4.26 | 4.37 |
| Min | 8.83 | 10.35 | 9.01 | 16.36 |
| Median | 24.04 | 25.11 | 26.77 | 27.52 |
| Max | 34.84 | 39.44 | 39.21 | 38.37 |
BMI_by_age <- with(tobacco,
by(BMI, age.gr, descr, transpose = TRUE,
stats = c("mean", "sd", "min", "med", "max")))
view(BMI_by_age, "pander", style = "rmarkdown", headings = FALSE)
| | Mean | Std.Dev | Min | Median | Max |
|---|---|---|---|---|---|
| 18-34 | 23.84 | 4.23 | 8.83 | 24.04 | 34.84 |
| 35-50 | 25.11 | 4.34 | 10.35 | 25.11 | 39.44 |
| 51-70 | 26.91 | 4.26 | 9.01 | 26.77 | 39.21 |
| 71 + | 27.45 | 4.37 | 16.36 | 27.52 | 38.37 |
tobacco_subset <- tobacco[ ,c("gender", "age.gr", "smoker")]
freq_tables <- lapply(tobacco_subset, freq)
# view(freq_tables, footnote = NA, file = 'freq-tables.html')
function ‘is’ appears not to be S3 generic; found functions that look like S3 methods
‘>=’ not meaningful for factors
$properties
$attributes.lengths
names class row.names
5 1 150
$extensive.is
[1] "is.data.frame" "is.list" "is.object"
[4] "is.recursive" "is.unsorted"
### Frequencies
#### tobacco$gender
**Type:** Factor
| | Freq | % Valid | % Valid Cum. | % Total | % Total Cum. |
|-----------:|-----:|--------:|-------------:|--------:|-------------:|
| **F** | 489 | 50.00 | 50.00 | 48.90 | 48.90 |
| **M** | 489 | 50.00 | 100.00 | 48.90 | 97.80 |
| **\<NA\>** | 22 | | | 2.20 | 100.00 |
| **Total** | 1000 | 100.00 | 100.00 | 100.00 | 100.00 |
| gender | Freq | % Valid | % Valid Cum. | % Total | % Total Cum. |
|---|---|---|---|---|---|
| F | 489 | 50.00 | 50.00 | 48.90 | 48.90 |
| M | 489 | 50.00 | 100.00 | 48.90 | 97.80 |
| <NA> | 22 | | | 2.20 | 100.00 |
| Total | 1000 | 100.00 | 100.00 | 100.00 | 100.00 |
library(skimr)
skim(iris)  # iris is a built-in example; pass your own data frame here